Deep neural networks are learning models with very high capacity and are therefore prone to over-fitting. Many regularization techniques, such as Dropout, DropConnect, and weight decay, attempt to solve the problem of over-fitting by reducing the capacity of their respective models (Srivastava et al., 2014; Wan et al., 2013; Krogh & Hertz, 1992). In this paper we introduce a new form of regularization that guides the learning problem in a way that reduces over-fitting without sacrificing the capacity of the model. The mistakes that a model makes in the early stages of training carry information about the learning problem. By adjusting the labels for the current epoch of training to a weighted average of the real labels and an exponential average of the past soft targets, we achieved a regularization scheme as powerful as Dropout without necessarily reducing the capacity of the model, while simplifying the learning problem. SoftTarget regularization proved to be an effective tool across a variety of neural network architectures.
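The label-adjustment rule described in the abstract can be made concrete with a short sketch: each epoch, an exponential average of the model's past soft predictions is updated, and the training targets become a weighted average of that running average and the real one-hot labels. The hyperparameter names `beta` and `gamma`, the epoch-level update schedule, and the `model.predict`/`model.fit` interface below are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch of SoftTarget-style label blending, assuming an
# epoch-level update with hypothetical hyperparameters beta and gamma.
import numpy as np

def update_soft_target_ema(ema, predictions, beta=0.9):
    """Exponentially average the model's past soft predictions.

    ema:         running average of past soft targets, (n_samples, n_classes)
    predictions: the model's current soft outputs for the same samples
    beta:        decay rate of the exponential average (assumed value)
    """
    return beta * ema + (1.0 - beta) * predictions

def blend_labels(y_true, ema, gamma=0.7):
    """Weighted average of the real one-hot labels and the EMA of soft targets.

    gamma: weight on the real labels in the blended targets (assumed value)
    """
    return gamma * y_true + (1.0 - gamma) * ema

# Hypothetical usage inside a training loop (model is a placeholder):
# ema = y_true.copy()                    # initialize with the hard labels
# for epoch in range(n_epochs):
#     preds = model.predict(x_train)     # soft outputs, rows sum to 1
#     ema = update_soft_target_ema(ema, preds, beta=0.9)
#     targets = blend_labels(y_true, ema, gamma=0.7)
#     model.fit(x_train, targets)        # train this epoch on blended labels
```

Because the blended targets reflect the model's own early mistakes, confidently wrong classes are softened rather than forced to zero, which is the mechanism the abstract credits with reducing over-fitting without shrinking model capacity.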